Extending the punctuation module for European Portuguese
Abstract
This paper describes our recent work on extending the punctuation module for automatic subtitling of Portuguese Broadcast News. The main improvement came from the use of prosodic information, which enabled the previous module, which covered only full stops and commas, to be extended to cover question marks as well. The approach combines lexical, acoustic, and prosodic information. Our results show that prosodic information is relevant for all types of punctuation. An analysis of the results also shows which types of interrogative our method handles best, taking into account the specificities of Portuguese. Results may therefore differ across corpora, depending on which types of interrogatives are most frequent.
Similar resources
Recovering capitalization and punctuation marks for automatic speech recognition: Case study for Portuguese broadcast news
This work presents a study on recovering punctuation marks and capitalization information from European Portuguese broadcast news speech transcriptions. Different approaches were tested for capitalization, both generative and discriminative, using finite state transducers automatically built from language models, and maximum entropy models. Several resources were used, includi...
Modules whose direct summands are FI-extending
A module $M$ is called FI-extending if every fully invariant submodule of $M$ is essential in a direct summand of $M$. It is not known whether a direct summand of an FI-extending module is also FI-extending. This study gives some answers to the question of under what conditions a direct summand of an FI-extending module is itself FI-extending.
$PI$-extending modules via nontrivial complex bundles and Abelian endomorphism rings
A module is said to be $PI$-extending provided that every projection invariant submodule is essential in a direct summand of the module. In this paper, we focus on direct summands and indecomposable decompositions of $PI$-extending modules. To this end, we provide several counterexamples, including the tangent bundles of complex spheres of dimension greater than or equal to 5 and certain hyper ...
A relative extending module and torsion precovers
We first characterize $\tau$-complemented modules with relative (pre)-covers. We also introduce an extending module relative to $\tau$-pure submodules on a hereditary torsion theory $\tau$ and give its relationship with $\tau$-complemented modules.
Recovering Capitalization and Punctuation Marks on Speech Transcriptions
This work addresses two metadata annotation tasks involved in the production of rich transcripts: automatic capitalization and punctuation mark recovery. The main focus is broadcast news, using both manual and automatic speech transcripts. Different capitalization models were analysed and compared, and results support the idea that generative approaches capture the structure of writte...